ABSTRACT

The use of the cloze procedure as a method of determining relative readability was first introduced by Taylor, who contended that the cloze procedure was a superior method of determining readability since it measured a facet of readability called concept load. Authorities have accepted cloze tests as valid and reliable instruments capable of measuring general comprehension skills. Agreement now exists as to the accepted methodology of cloze test construction; a deletion system of every fifth word with a standardized blank length has been accepted, although recent studies have submitted that blank length is not crucial to the construction of cloze tests. Three studies indicated the suitability of the cloze procedure as a method of determining the [instructional] levels of elementary school pupils. In reviewing the literature, many research weaknesses become apparent. Many researchers failed to mention the procedures which were employed in cloze test construction. Deletion system and scoring procedures are often described inadequately. More needs to be known about the construct validity of cloze tests. Several affective components operating during the completion of a cloze test also need to be investigated. (HR)

Historical Overview of the Cloze Procedure

by Richard B. Walter
Kean College of New Jersey



Many of the innovations in reading are not based upon entirely new concepts. Rather, they are relatively similar to methods which have preceded them. An example is the collection of processes used to determine readability. All are methodologically applied to the reading material, without a reader being involved. Most of the methods of determining readability are concerned with the presence or absence of specific words in the reading selection, and with average sentence length. Thus, it was with interest that the reading world viewed the newly introduced cloze procedure. Here was a method that determined the difficulty of reading material by ranking the selections in order of difficulty. It also depended upon the presence of a reader.

The cloze procedure was developed by Wilson L. Taylor in 1953 when he was doing graduate work at the University of Illinois. Taylor applied the statistical principles of randomization to the mutilation of a prose passage. This mathematical deletion of words (that is, every nth word) was used as an exercise to determine the success a reader would have in supplying the missing word. Taylor designated this the "cloze procedure."¹

Taylor stated:

     It is pronounced like the word "close" and is derived from "closure." The last term is one Gestalt psychology applies to the human tendency to complete a familiar but not-quite-finished pattern -- to "see" a broken circle as a whole one, for example, by mentally closing up the gaps.²

¹Wilson L. Taylor, "Cloze Procedure: A New Tool for Measuring Readability," Journalism Quarterly, XXX (Fall, 1953), 414.



Twelve years later, Weaver took issue with Taylor's analogy and claimed that the designation "cloze" was misleading since many of the common constructs of Gestalt psychology cannot be easily applied to verbal situations.³ Nonetheless, the term survives.

Models have long been popular in education. The profession has used physiological and psychological models; it is presently being influenced by a computer and space age model, that of systems analysis. Taylor elected to use an electronic model for his illustrative definition of the cloze procedure. He stated:



     Cloze procedure may be defined as: a method of interrupting a message from a "transmitter" (writer or speaker), mutilating its language patterns by deleting parts, and so administering it to "receivers" (readers or listeners) so that their attempts to make the patterns whole again potentially yield a considerable number of cloze units.⁴



Taylor defined cloze units as:

     Any single occurrence of a successful attempt to reproduce accurately a part deleted from a "message" (any language product) by deciding, from the context that remains, what the missing part should be.⁵

²Ibid., p. 41[?].
³Wendell W. Weaver, "Theoretical Aspects of Cloze Procedure," Fourteenth Yearbook of the National Reading Conference (Milwaukee: National Reading Conference, Inc., 1965), pp. 115-32.
⁴Taylor, op. cit., p. 416.



Initially, Taylor applied the cloze procedure to the task of determining the relative difficulty of a selection of reading material. As previously stated, this was a rather unique approach to readability, for, as Anderson stated, "Cloze as a readability technique involves the [text missing]

The basic theory behind the use of the cloze procedure as a method of determining readability was simply stated by Hafner: "The individual's choice [of words] is an index of his ability to comprehend reading matter."⁷

A conclusion of Taylor's research was that the cloze procedure was a more valid predictor of reading difficulty than either the Dale-Chall or the Flesch formula. Taylor found that these formulae could be "fooled" by authors' style and selected specific examples from literature to demonstrate this. Passages were selected from the works of Gertrude Stein and James Joyce. The writings of Stein and Joyce have, as a common feature, passages which are easily read in terms of the level of vocabulary, but with a high concept load. Taylor concluded that the Dale-Chall and the Flesch formulae were not sensitive enough to many aspects of reading, since the application of these two formulae yielded very low levels of readability when applied to the passages written by Stein and Joyce. Taylor felt that the cloze procedure did not suffer from this deficit and stated:

     Potentially important, it seems, is the fact that a cloze score seems to be a measure of the aggregate influence of all factors which interact to affect the degree of correspondence between the language patterns of transmitter and receiver.⁸

⁶J. Anderson, "Research in Readability for the Classroom Teacher," Journal of Reading, X (February, 1967), 291-99.
⁷L. E. Hafner, "Cloze Procedure," Journal of Reading, IX (May, 1966), 415-21.

Having confirmed his belief that the cloze procedure is a valid indicator of readability, Taylor submitted that the cloze procedure should not be limited to this particular aspect of reading. He wrote:

     As such, its potential usefulness is by no means confined either to readability or to the reading abilities of individuals.⁹

Several years later, Taylor again wrote of the cloze procedure, this time expanding upon his previous premonition of its potential usefulness. Referring to his initial article, he stated:

     At that time, this writer tended to regard cloze procedure mainly as a "new and better" way of determining the readability of printed English prose. Even then, however, there were indications that the technique was not limited to readability matters. And now -- two years and many experiments later -- the cloze method appears to possess a very large number of potential research uses, both practical and theoretical.¹⁰

Taylor further stated: "The technique seems to be an effective gauge of 'individual differences' in the comprehension of readers."¹¹

By this time, the cloze procedure had attracted the attention of other investigators. Weintraub reported on the surveys of the literature which had been done by Rankin and observed that while the initial review of the literature done by Rankin in 1959 contained a bibliography of 12 studies, a similar review published in 196[?] contains a bibliography of almost 50 articles.¹² Bormuth, a most prolific writer on the topic of cloze procedure, discussed the cloze procedure as a method of readability in 1963.¹³

⁸Taylor, op. cit., p. 432.
¹⁰Wilson L. Taylor, "Recent Developments in the Use of the Cloze Procedure," Journalism Quarterly, XXXIII (Winter, 1956), 42.
¹¹Ibid.

Hafner followed with an article on the implications of cloze in 1965.¹⁴ Gallant obtained correlations between cloze scores and grade equivalent scores on comparable forms of a standardized reading test, one rewritten as a cloze test.¹⁵ Gallant's conclusions are to be interpreted cautiously, as she assumed the cloze procedure to be too complex a procedure for first graders and modified it into a multiple-choice test. Thus, by offering possible responses, Gallant deviated from one of the basic ground rules of the cloze procedure -- that the subject supplies the response of his choice. Gallant supplied no rationale for the method of selecting the non-correct responses which appeared on the test. Since, in almost all cases, one of the responses would not fit the blank due to a difference in language patterns, the results would seem to have little application to the cloze procedure.

Greene reported upon the use of the cloze procedure in the primary grades in 1965¹⁶ and discussed many potential uses of the cloze procedure in 1967.¹⁷ It was during this period of time that many researchers began studying specific aspects of the cloze procedure. The more significant studies will be discussed beginning with cloze test construction.

¹²Samuel Weintraub, "Cloze Procedure," Reading Teacher, XXI (March, 1968), 567.
¹³John Bormuth, "Cloze as a Measure of Readability," Proceedings of the International Reading Association, VIII (Newark, Delaware: International Reading Association, 1963), 131-34.
¹⁴L. E. Hafner, "Implications of Cloze," Fourteenth Yearbook of the National Reading Conference (Milwaukee: National Reading Conference, Inc., 1965), pp. 134-45.
¹⁵Ruth Gallant, "Use of Cloze as a Measure of Readability in the Primary Grades," Proceedings of the International Reading Association, X (Newark, Delaware: International Reading Association, 1965), 286-87.



Cloze Test Construction

Taylor's initial article on the cloze procedure contained a suggested methodology for the construction of a cloze test. This test was to be used to determine the readability, or relative difficulty, of a specific selection of reading material. The steps were as follows:

     1. Deleting an equal number of words from each passage by some essentially random counting-out system. Such a system is based on a table of random numbers or else is simply counted out every nth word (every fifth one, for example) without any regard for the functions or meanings of specific words. . . .

     2. Reproducing each mutilated passage with a blank of some standard length (so the length will not influence guessing) in place of every missing word.

     3. Giving copies of all reproduced passages to all subjects -- or to equal numbers of randomly selected subjects -- in a sample group representative of the population in question. . . .

     4. Asking all subjects to try to fill in all blanks by guessing, from the context of remaining words, what the missing words should be.

     5. Totaling for each passage separately the number of times original words were correctly replaced and considering these totals as readability scores.

     6. Contrasting the cloze totals of various passages: the passage with the highest score was considered "most readable," the one with the second highest score, next most readable, etc., pending the outcome of statistical tests of significance of the differences observed.¹⁸

¹⁶Frank Greene, "Modification of the Cloze Procedure and Changes in Reading Test Performances," Journal of Educational Measurement, II (1965), 213-2[?].
¹⁷Frank Greene, "Cloze Symposium," Educational Resources Information Center Document ED 016 581 (Bethesda, Maryland: ERIC Document Reproduction Service, 1967), Abstract.
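
The mechanical character of steps 1, 2, and 5 can be illustrated with a short sketch. It is not taken from Taylor's article. The function names, the ten-underscore blank, the decision to begin counting so that the nth word is the first deletion, and the comparison that ignores case and punctuation are all assumptions made for the illustration.

```python
import re

# Assumption: a uniform blank of ten characters, the width most often
# mentioned in the literature reviewed here.
BLANK = "_" * 10


def make_cloze(text, n=5):
    """Delete every n-th word (hypothetical helper, not Taylor's own code).

    Returns the mutilated passage and the list of deleted words in order.
    """
    words = text.split()
    mutilated, key = [], []
    for position, word in enumerate(words, start=1):
        if position % n == 0:          # every n-th word, regardless of function
            key.append(word)
            mutilated.append(BLANK)    # same blank length for every deletion
        else:
            mutilated.append(word)
    return " ".join(mutilated), key


def normalize(word):
    """Strip punctuation and case so 'Today,' matches 'today' (an assumption)."""
    return re.sub(r"[^\w'-]", "", word).lower()


def exact_word_score(key, responses):
    """One point for each response that restores the original word."""
    return sum(normalize(k) == normalize(r) for k, r in zip(key, responses))


passage = ("The quick brown fox jumps over the lazy dog "
           "near the quiet river bank today")
mutilated, key = make_cloze(passage, n=5)
print(mutilated)
print(key)
print(exact_word_score(key, ["jumps", "by", "Today"]))
```

Because the deletion rule looks only at word position, two people building a test from the same passage obtain the same blanks, which is the objectivity the procedure is credited with.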



Deletion Systems

One of the first aspects of cloze test construction to come under investigation was the deletion rate of words. In his initial article, as cited above, Taylor suggested a deletion rate of every fifth word. There was little evidence at that time to support Taylor's decision, which, it seems, was completely arbitrary.

Taylor provided the rationale in his second article on the cloze procedure. His statement was as follows: "Also, it appears that an every-fifth-word deletion system spaces blanks as far apart as they need to be."¹⁹

The rationale for this decision was [included] in a footnote in which Taylor indicated that Madow, a statistician, had ". . . verified that a subject's performance on successive blanks created by an every-fifth-word deletion is statistically independent."²⁰ The problem of statistical dependency occurs in deletion systems of less than every-fifth-word. The rationale of using a deletion system of less than every fifth word is based upon the benefits gained by obtaining a greater number of cloze units from a given paragraph. This reduces the length of the test, length being a point of concern necessary to insure the reduction of sampling error and to insure reliability.

¹⁸Taylor, "Cloze Procedure," op. cit., p. 416.
¹⁹Taylor, "Recent Developments," op. cit., p. 48.
²⁰Ibid.



MacGinitie designed a study to investigate the effects of a less than every-fifth-word deletion system. He concluded that:

     . . . no statistically significant difference was found in the difficulty of restoring omitted words when every 24th, 12th, or 6th word was omitted, but omitting every 3rd word made restoration more difficult.²¹

MacGinitie's concern with less than every-fifth-word deletion systems was not matched by most investigators. Many research studies have investigated systems which allow for more than four words between blanks.

Culhane based his requirements for a deletion system on the types of materials being read. He recommended the every-fifth-word deletion system for narrative materials but felt that an every-tenth-word count was appropriate for textual materials that were [illegible].²² Potter concurred with this reasoning and even suggested a deletion system of one word in every twelve for certain purposes.²³

McLeod and Anderson attempted to use a deletion rate of every-eighth-word and found that certain blanks evoked ambiguous responses. These blanks were replaced, resulting in a loss of randomization, and this procedure was dismissed as being unsuccessful.²⁴

²¹Walter H. MacGinitie, "Contextual Constraint in English Prose Paragraphs," Journal of Psychology, LI (1961), 12[?].
²²Joseph W. Culhane, "Cloze Procedure and Comprehension," Reading Teacher, XXIII (February, 1970), 410-13.
²³Thomas C. Potter, "A Taxonomy of Cloze Research, Part I: Readability and Reading Comprehension," Educational Resources Information Center Document ED 022 64[?] (Bethesda, Maryland: ERIC Document Reproduction Service, 1968), Abstract.
²⁴J. McLeod and J. Anderson, "Readability Assessment and Word Redundancy of Printed Speech," Psychological Reports, XXVIII (February, 1966), 35-38.



In view of the findings of the previously mentioned investigations, it is particularly interesting to note a study by MacGinitie which involved 600 college students. MacGinitie used deletion systems of one in six, one in twelve, and one in twenty-four. He found no significant differences among the cloze test scores.²⁵

A question also arose as to the desirability of making lexical rather than structural deletions. It is conceivable that cloze tests could be constructed by deleting a specific class of words such as nouns or adjectives. This technique seems justified in the event the investigator wishes to assess the language competencies of a group of subjects. It would seem that such a technique would be unsuitable for purposes of assessing readability or comprehension skills. The deletion of a specific class of words and the frequency with which they occur in a paragraph, in and of itself, might be an influencing factor which could affect a subject's cloze score by altering the difficulty of the passage. Taylor [investigated] this deletion technique and concluded that: "Findings up to now indicate that . . . there seems to be little advantage in preclassifying words and limiting deletions to them."²⁶

Further, it would seem that once the randomized deletion rate is separated from the cloze procedure, the loss of objectivity would reduce the task to little more than sentence completion. Also, one of the most important qualities of the cloze procedure, the fact that the selection process of test items is completely void of subjectivity, would be lost.

Potter's statement, as a result of an extensive review of the cloze procedure, best summarizes present thought on this matter. He wrote: "The practice of selecting cloze deletions on other than a pre-determined, mechanical formula seems questionable since a random sample of the test [remainder illegible]."²⁷

²⁵MacGinitie, op. cit., pp. 121-30.
²⁶Taylor, loc. cit.

Having established the most successful system to be used, attention will now be given to the total number of deletions necessary for inclusion in a cloze test.

In his second article, Taylor suggested a passage length of 250 words.²⁸ Potter proposed a series of guidelines for cloze test construction and stated that:

     . . . (2) not more than 20 words in every [illegible], (3) paragraph length be at least 250 words, (4) [illegible] words be deleted in order to assure adequate [illegible] passages.²⁹

Rankin, in a study of students in grades four to eight, also used a 250-word passage with every fifth word deleted.³⁰ There is little debate in the literature as to the appropriateness of the 250-word length. Most investigators have accepted this standard.
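
The interplay between passage length and deletion rate is simple arithmetic, and a brief sketch may make the trade-off concrete. The function name is hypothetical, and the count assumes that deletion begins at the nth word and continues to the end of the passage, which the studies reviewed here do not always specify.

```python
def cloze_units(passage_words, n):
    """Number of blanks produced when every n-th word is deleted.

    Assumes counting starts at the first word, so the n-th word is the
    first deletion (an assumption for this illustration).
    """
    return passage_words // n


# Deletion rates mentioned in the studies reviewed above,
# applied to the 250-word passage length.
for n in (3, 5, 6, 8, 10, 12, 24):
    print(f"every {n}th word -> {cloze_units(250, n)} blanks")
```

Under these assumptions a 250-word passage with every fifth word deleted yields 50 blanks, while an every-twelfth-word system yields only 20, which illustrates why sparser deletion systems call for longer passages to obtain the same number of cloze units.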






Blank Length

While the length of the passage does not seem to be an issue, the length of the blanks has been given some thought. Early investigators have generally accepted the logic of Taylor's statement that blanks should be of a predetermined length (usually ten spaces) so as to curtail the subject's attempts to guess the correct [text missing] but specified that a consistent blank length is important only if [the passage is] to be used as a test. Culhane felt that unintentional clues of this type would affect pupil performance in [illegible].³¹

²⁷Potter, loc. cit.
²⁸Taylor, "Recent Developments," op. cit., p. 48.
²⁹Potter, loc. cit.
³⁰Earl F. Rankin, "Grade Level Interpretation of Cloze Readability Scores," Educational Resources Information Center Document ED 046 [illegible] (Bethesda, Maryland: ERIC Document Reproduction Service, 1970), Abstract.

Anderson designed a study to investigate the effect of blank length on performance and concluded that:



     . . . [the scores of] subjects on cloze tests using blanks of uniform length, and [the] scores of subjects on cloze tests using blanks of the same length as the [words deleted] were equally valid as measures of general reading comprehension.³²

Blank length is of practical as well as academic interest. If further research is supportive, [cloze test construction can] be greatly simplified. In order to construct a test utilizing a uniform blank length procedure, it is necessary to retype [the selection and] rearrange the sentence spacing so as to accommodate the blank length of ten spaces. [This increases] either printing costs or [illegible]. Even when a primary typewriter is used, the advantages of print size [illegible] reading [illegible].

The question of blank length has not received as much [attention] as the responses which fill the blanks. Much of the early research on the cloze procedure was addressed to the problem of what would constitute an appropriate cloze response.

³¹Culhane, op. cit., pp. 410-13.
³²[Author and title illegible], Educational Resources Information Center Document ED [number illegible] (Bethesda, Maryland: ERIC Document Reproduction Service, 1970), Abstract.



Taylor was explicit in his directions pertaining to cloze responses. He recommended "totaling for each passage separately the number of times original words were correctly replaced and considering these totals as readability scores."³³ In his second article, published in 1956, Taylor again addressed himself to the problem of correct responses:

     . . . Each time the subject correctly guesses a missing word, he scores one point; [his score] for any particular passage is simply the total number of missing words that he guesses correctly.³⁴

Thus, each cloze passage yields a raw score based upon the principle of one point per correct response.

The problem of [comparing] scores on cloze passages arises when passages differ in length. To resolve this problem, raw scores are usually converted to percentage scores.
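
The conversion is a single division, shown here as a small sketch. The function name and the sample figures are invented for the illustration and are not drawn from any of the studies cited.

```python
def percentage_score(raw_score, blanks):
    """Convert a raw cloze score (one point per exact replacement) to a percentage."""
    if blanks <= 0:
        raise ValueError("a cloze passage must contain at least one blank")
    return 100.0 * raw_score / blanks


# Two hypothetical passages of different length become comparable:
print(percentage_score(20, 50))   # 20 of 50 blanks restored
print(percentage_score(12, 30))   # 12 of 30 blanks restored
```

Both hypothetical subjects restored the same proportion of deleted words, a fact that the raw scores of 20 and 12 would conceal.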

Taylor's stipulation to accept only exact-word replacement responses [eliminated] the element of [subjectivity in] scoring the tests. As desirable a feature as this may be, many researchers began [to investigate the] effects that accepting alternate responses had upon the validity of a cloze test.

One of the earliest of the alternate responses to be investigated was synonyms. The argument to accept synonyms is logical. Our language lacks specificity; we use many words to describe a particular object. Taylor felt that scoring synonyms did not justify the extra effort involved. Taylor contended that there was

     . . . no advantage to [putting] oneself to the trouble of judging and scoring synonyms. . . . Findings up to now indicate that the easiest ways of applying cloze procedure may be best for most uses.³⁶

³³Taylor, "Cloze Procedure," op. cit., p. 416.
³⁴Taylor, "Recent Developments," op. cit., p. 43.

Culhane supported Taylor's beliefs and [stated]:

     It is possible that synonyms may be scored as correct, but this would make scoring cumbersome and lead to arbitrary decisions regarding the worth of the synonyms as a replacement.³⁷

In a study which compared the exact word replacement method with the synonym replacement method, Ruddell found no significant differences between the two scoring methods in terms of the validity and reliability of the tests.³⁸ Gallant decided, as a result of her study, that accepting substitute words which approximated the meaning of the word deleted and agreed in both person and tense did not produce a statistically significant difference, even though higher correlations were obtained using the substitute scoring method.³⁹

Since not all elementary school children are proficient spellers, the problem of how to interpret misspelled words was acknowledged by Bormuth. He suggested accepting misspelled words when the response is otherwise correct, and when the misspelling does not result in the spelling of another word which does not fit the syntactical context of the cloze blank. Bormuth also conducted various studies to determine the effects of alternate responses upon validity and concluded that the most economical and objective methods of scoring are to be preferred. [Exact word replacement], he decided, yields the most valid results. Jongsma addressed this problem after an extensive survey of the literature on the cloze procedure and observed:

     Thus it appears that the literature consistently shows the scoring of exact word replacements to be the most objective, efficient, and useful scoring system to use with the cloze procedure.

The preceding sections contain a discussion of some of the techniques and problems of cloze test construction. The following section examines the question of whether or not the cloze procedure and cloze tests are valid and reliable measures of readability.

Validity and Reliability of Cloze Tests as Indicators of Readability

In his initial publication on cloze procedure, Taylor supported the validity of the cloze test by comparing it with the Dale-Chall and the Flesch readability formulae. In this study, Taylor demonstrated that standard readability formulae can be "fooled" by applying these formulae to passages written by James Joyce and Gertrude Stein. These authors' writings are ranked as more difficult by the cloze procedure than by the Dale-Chall and the Flesch formulae. Taylor explained this by pointing out that the Dale-Chall and the Flesch formulae cannot evaluate comprehension difficulty as can the cloze procedure which, Taylor felt, measures the factor of concept difficulty. The cloze procedure, he contended, does not suffer from the inadequacies of a pre-selected word list.⁴³

A further point to be made in support of the cloze procedure can be found in the method used to determine the readability of a selection of reading material. All too often, the readability level of a book is found by taking three sample selections and computing the arithmetic mean of these samples. Inaccuracies occur when these samples are averaged. The mean readability score is often far below the score of the most difficult selection. For example, the mean score of the following readability levels 2.0, 5.0, and 7.0 is 4.7. Some might conclude that this particular selection is suitable for fourth grade pupils who are at grade level at the end of the year and that it is certainly suitable for most fifth graders. However, in order to be capable of reading the entire selection, a fifth grade pupil is going to have to cope with some seventh grade reading material.
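
The arithmetic behind this example can be checked with a few lines of code. The sketch below uses only the three grade levels given in the text. The function name and the idea of reporting the gap between the mean and the hardest sample are additions made for the illustration.

```python
from statistics import mean


def summarize_samples(levels):
    """Return the mean level, the hardest sample, and the gap between them."""
    average = mean(levels)
    hardest = max(levels)
    return average, hardest, hardest - average


average, hardest, gap = summarize_samples([2.0, 5.0, 7.0])
print(round(average, 1))   # mean readability level reported for the book
print(hardest)             # level of the most difficult sample
print(round(gap, 1))       # how far the mean understates that sample
```

The mean rounds to 4.7, more than two grade levels below the hardest of the three samples, which is the understatement the paragraph describes.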

The cloze procedure, which does not make use of arithmetical averages, seems to yield a more valid indication of readability. Further, most readability formulae are passive: they are formulae applied to the selection without input from the pupil. The cloze procedure might be termed a more active method since it involves the reader as he is reading the material.

Froese compared the validity of the Dale-Chall readability formula for sixth grade science textbooks with a cloze test over the same material. His major conclusions were:

     (1) Cloze tests were reliable measures of language difficulty and (2) the Dale-Chall readability formula is not a valid measure of sixth grade science textbook materials when the cloze procedure is used as a criterion.

⁴³Taylor, "Cloze Procedure," op. cit., pp. 414-38.

In a study to determine grade level readability scores, Fry compared the Fry Readability Graph with the Spache formula, the cloze procedure, and oral reading using seven primary level textbooks. He reported high correlations for all four methods and concluded that "the cloze method was the most accurate and the most capable of making fine distinctions." However, Fry felt that the cloze procedure had limited use, since it required the presence of a group of subjects to read the selections.

Weintraub also investigated the question of cloze validity and reliability. His conclusions, based on a review of several studies, were

     (1) that the cloze procedure provided a more reliable measure of readability for non-standard reading materials than commonly used readability formulas, (2) that cloze tests are valid and reliable predictors of the difficulty level of reading materials, and (3) that scores on comprehension tests correlate highly with cloze readability scores. These findings indicated that the cloze procedure could be used effectively as a measure of readability and of comprehension.

It appears that the validity of the cloze procedure is an accepted fact insofar as the measurement of readability is concerned. The following discussion centers around the use of the cloze procedure as a valid and reliable indicator of comprehension skills.

Validity and Reliability of Cloze Tests as Indicators of General Comprehension

Cloze tests appear to be a valid measuring device based upon the following logic relating to face validity.

Face Validity of the Cloze Procedure

Both standard achievement and teacher-made tests in reading most often rely
upon the use of multiple-choice or sentence completion items to test reading
comprehension skills. A multiple-choice test readily lends itself to the process
of elimination in selecting correct responses. Many of the responses to the stem
of the question are easily eliminated by the subject as he considers the subtle
clues inherent in the construction of multiple-choice questions. Differences
in verb agreement, redundancy of sentence style, syntactical mismatching and
illogical alternatives furnish the more sophisticated test-taker with a variety
of clues which guide his choice of responses. Subjects often disregard what is
obviously wrong in an effort to determine what is probably right. Further, the
statistical benefits of guessing should not be overlooked. Bormuth offered two
sets of scores in the results of one of his studies. The first set was raw
test scores; the second set had been statistically corrected for guessing.
Bormuth stated:

     A set of corrected multiple-choice scores was also calculated.
     This calculation was based upon the assumption that a subject's
     score is made up of two components. The number of items on
     which he knew the right answer, and the number of items on which
     he guessed correctly. It was also assumed, because there were
     four alternatives for each item, subjects guessed correctly by
     one-fourth of the items for which they did not know the correct
     answers.
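
The assumption quoted above implies a simple arithmetic correction. As a rough sketch only (the function name and the sample figures are illustrative assumptions, not taken from Bormuth's study): if a subject knows K of N items and guesses correctly on one-fourth of the remainder, the observed number right is R = K + (N - K)/4, which can be solved for K.

```python
def correct_for_guessing(num_right, num_items, num_alternatives=4):
    """Estimate how many items a subject actually knew.

    Assumes every item was attempted and that blind guesses succeed on
    1/num_alternatives of the unknown items, so that
    num_right = known + (num_items - known) / num_alternatives.
    """
    if num_alternatives < 2:
        raise ValueError("need at least two alternatives per item")
    if not 0 <= num_right <= num_items:
        raise ValueError("num_right must lie between 0 and num_items")
    num_wrong = num_items - num_right
    # Solving the equation above for `known` gives the classical correction.
    return num_right - num_wrong / (num_alternatives - 1)


# Hypothetical example: 31 right out of 40 four-choice items.
print(correct_for_guessing(31, 40))  # prints 28.0
```

A subject scoring at the chance level (10 right out of 40) receives a corrected score of zero under this assumption.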

Sentence completion test items do not suffer from the previously mentioned
drawbacks. However, in both the multiple-choice and sentence completion types of
test question, vocabulary control and concept control of the questions are factors
to be considered. Generally, test items are derived from the text but often contain
grammatical constructions and vocabulary which are unfamiliar to the subject taking
the test. The cloze procedure, which draws from the actual material itself for test
items, inserts no added vocabulary or changes in grammatical constructions. Rather,
the text is the test. The problem of a pupil comprehending the selection, but not
the question, can never be encountered in the cloze procedure.
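
To make concrete the sense in which "the text is the test," the following is a minimal sketch of cloze construction, assuming the every-fifth-word deletion system with a blank of fixed length that this review reports as the accepted method. The function name, the blank width of ten underscores, and the handling of punctuation are assumptions made for illustration.

```python
import re

BLANK = "_" * 10  # fixed blank length; the width of 10 is an assumption


def make_cloze(text, every=5, blank=BLANK):
    """Delete every `every`-th word of `text`.

    Returns (cloze_text, answers), where `answers` lists the deleted
    words in order. Punctuation attached to a deleted word is kept.
    """
    answers = []
    out = []
    for position, token in enumerate(text.split(), start=1):
        if position % every == 0:
            # Separate leading/trailing punctuation from the word itself.
            lead, word, trail = re.match(r"^(\W*)(.*?)(\W*)$", token).groups()
            if word:
                answers.append(word)
                out.append(lead + blank + trail)
                continue
        out.append(token)
    return " ".join(out), answers


passage = "The quick brown fox jumps over the lazy dog. It then runs away."
cloze_text, answers = make_cloze(passage)
print(cloze_text)
print(answers)
```

Nothing is added to the passage except blanks, so the items contain no vocabulary or constructions beyond those of the material itself.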



Concurrent Validity of the Cloze Procedure

Most of the attempts to prove concurrent validity have dealt with the nature of
the relationships among the cloze procedure and other standardized instruments and
procedures. Taylor found correlations of .76 between scores on a cloze test and a
comprehension test made from the same material. Bormuth reported correlations of
.73 to .84 between cloze test scores and conventional test scores made over the same
material and stated the correlations approached 1.00 when corrected for the
unreliability of the test. In a similar study, Bormuth found correlations of .92
between cloze readability scores over a passage and the difficulty of the same
passage as measured by multiple-choice tests. In a later study, Bormuth attempted to
determine the relationship between the cloze procedure and word recognition and
comprehension as separate concepts. He used all four forms of the Gray Oral Reading
Paragraphs Test and found correlations of .90 to .95 between the cloze test scores
and word recognition difficulties. Correlations of .91 to .96 were found between the
cloze test scores and comprehension difficulties relating to the paragraphs. An
interesting light in which to view these findings is this: if correlations exist
among cloze test scores and comprehension and word recognition difficulties, an
assumption may be made that a correlation should exist between scores on a cloze
test and scores on an IRI, an instrument which measures both word recognition and
comprehension abilities. Literature pertaining to this point is reviewed later in
this chapter. The following section is concerned with the construct validity of the
cloze procedure.
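
The statement that correlations "approached 1.00 when corrected for the unreliability of the test" can be illustrated with the classical correction for attenuation. This is a sketch under stated assumptions: the source does not say which formula Bormuth used, and the reliability figures in the example are hypothetical.

```python
import math


def disattenuate(r_xy, r_xx, r_yy):
    """Classical correction for attenuation.

    r_xy is the observed correlation between two tests; r_xx and r_yy
    are the reliability coefficients of each test.
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


# Hypothetical figures: observed r = .80, reliabilities .85 and .80.
print(round(disattenuate(0.80, 0.85, 0.80), 2))  # prints 0.97
```

The less reliable the two instruments, the more an observed correlation understates the relationship between the abilities they measure.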

Construct Validity of the Cloze Procedure

The literature reveals little agreement on this subject. This lack of agreement
seems to be due to the absence of a theoretical framework, since the processes a
person goes through in completing a cloze test are not presently known. An assumption
is made, and perhaps rightly so, that since the processes necessary for performance
on a cloze test are not unlike the processes of comprehension itself (as evidenced by
high correlation coefficients), a similarity exists. Jenkinson's study investigated
this problem. She attempted to examine the process of comprehension by asking her
subjects to "think aloud" while completing a cloze test. Thus, she developed a
classification system for analyzing the process of comprehension.

Kingston and Weaver applied the principles of factor analysis to the cloze
procedure in an attempt to determine the proportions of variance that could be
assigned to those factors which are basic to tests of reading ability, namely,
vocabulary and language aptitude. A series of standardized tests, four reading
cloze tests, and four listening cloze tests were administered to juniors in
college. The factors which were isolated were a verbal comprehension factor,
a cloze factor, and a rote memory, flexible retrieval factor. The data
showed that cloze tests were not related to the above three factors. Weaver
and Kingston concluded that "cloze tests are related only moderately to the
verbal comprehension factor." Bormuth reported several criticisms of the
above-mentioned study. He designed his research in an attempt to eliminate
many of the limitations he found in the study done by Weaver and Kingston.

It would be appropriate to discuss the two studies by contrasting them,
since they are similar in scope, and Bormuth's study is an attempt to rectify
certain "wrongs" which he found in the Weaver-Kingston study. Bormuth elected
to use an every-fifth-word deletion system as opposed to the selective deletion
system used in the Kingston-Weaver study. Bormuth randomly selected fourth,
fifth, and sixth graders, while Weaver and Kingston used college juniors. In
the area of criterion measures, Bormuth constructed his own multiple-choice
comprehension questions and based the tests' validity on pilot testing and
professional judgement; Weaver and Kingston used standardized tests as their
criterion measures. Bormuth concluded that "... one factor accounted for the
preponderance of the variance. ..." Bormuth found "... little difficulty
applying the name 'reading comprehension' to that factor."

Marion Jenkinson, "Selected Processes and Difficulties of Reading Compre-

The major studies which dealt with construct validity have been presented.
To this date, there is no research to prove positively the construct validity
of the cloze procedure. It is appropriate to assume the existence of a
relationship based upon the high, positive correlations that have been found
between cloze tests and many reading comprehension tests.

One of the first studies to investigate the validity and reliability of
cloze tests as a measuring tool for general comprehension was Jenkinson's study
of 1957. Jenkinson made deletions on the basis of word difficulty. Deletion
rates varied from every third word to every tenth word. Correlations between
the cloze test and the Cooperative Reading Test were .78 with the vocabulary
section and .73 with the comprehension section.

Hafner found correlations between cloze test scores and performance on the
Michigan Vocabulary Test (for college students). A correlation coefficient of
.56 was found.

Ruddell's study used an every-fifth-word deletion system controlled for
high and low frequency patterns of oral language. Ruddell stated that split-half
reliability went as high as .97. He also found correlations of .61 to .7[illegible]
between cloze scores and the Stanford Achievement Test.

Greene found lower coefficients between cloze scores and total comprehension.
A correlation of .[illegible]1 was found between cloze test scores and the Diagnostic
Reading Survey. Weintraub also found high correlations between cloze readability
scores and scores on comprehension tests.

Gallant found the reliability of cloze tests to be high, ranging from .90 to
.97. These were significant beyond the .01 level of confidence, and her conclusion
was that cloze tests were reliable for all three grade levels tested. Bormuth
also computed reliability coefficients by two methods and obtained coefficients from
.70 to .95.
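
Split-half reliability coefficients of the kind reported above can be computed along the following lines. This is a sketch under assumptions: the source does not describe how any investigator divided the test or whether the Spearman-Brown step-up was applied; the odd-even split, the function names, and the sample data are all illustrative.

```python
import math


def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step a half-test correlation up to full-test length."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores holds one list of 0/1 item scores per subject."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    return spearman_brown(pearson(odd_totals, even_totals))


# Hypothetical item scores for three subjects on a four-blank cloze test.
scores = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 0, 0]]
print(split_half_reliability(scores))
```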

Kirby designed a study to determine whether cloze test scores would differ
significantly from standardized oral and silent reading test scores. The tests used
were the Gilmore Oral Reading Test, the Gray Oral Reading Test, and the Gates Oral
Reading Test. Statistical differences among these tests were significant for grades
one through three. However, for grades five and six, there were no significant
differences among the mean scores. Mean scores on the cloze tests and mean
scores on the Gilmore test did not differ at these grade levels. Mean scores
on the Gates Oral Reading Test did not differ significantly except at grades
one and two. Mean cloze scores did differ significantly from mean test scores
on the Gray Oral Reading Test for the entire sample.

Frank Greene, "Modification of the Cloze Procedure and Changes in Reading
Test Performances," Journal of Educational Measurement, II (1965), 213-17.

Weintraub, op. cit., p. 2.

In much of the literature on this subject, investigators have deviated
from agreed-upon cloze test construction procedures such as deletion rate,
response scoring, and other factors of test construction. Because of this, their
conclusions are less useful than the conclusions and findings of the
above-mentioned studies.

The preceding discussion dealt with the validity and reliability of the cloze
procedure. The opinions of many investigators are summed up by Bormuth's statement
that cloze tests are valuable "... because they are highly reliable and valid,
and can be easily and objectively constructed and scored."

Interpretation of Cloze Test Scores
The problem of how to interpret cloze test scores has plagued researchers
more than any other facet of the cloze procedure. Initially, raw scores were
used, but this method was of little value. Since individual grade levels of a
cloze test normally differ in length, raw scores do not lend themselves to
comparison. In an effort to eliminate this problem many researchers have converted
raw scores to percentage scores, thus enabling comparisons to be made. Although
slightly more useful, percentage scores simply furnish an indication of rank.

For example, if a pupil scored 55 per cent on a specific grade level passage,
and his classmate scored 45 per cent on the same passage, it is at once obvious
which pupil has performed less satisfactorily on the test. Similarly, should a
pupil score 60 per cent on a cloze passage of fifth grade difficulty, and make
lower scores on each succeeding grade level, it is a safe assumption that the
fifth grade passage elicited his best performance. What is not known is how
precisely the reader has comprehended the material on any selection. The problem
is that conventional measures of reading ability are usually expressed as
grade equivalent scores. A method of converting cloze test scores into grade
equivalent scores is needed as well as a criterion score to indicate an
acceptable performance level on a cloze test.
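
The conversion from raw to percentage scores described above is simple arithmetic, sketched below. The function name and the sample figures are hypothetical and serve only to show why raw scores from passages of different lengths cannot be compared directly.

```python
def percentage_score(raw_correct, num_blanks):
    """Convert a raw cloze score to a percentage of blanks filled correctly."""
    if num_blanks <= 0:
        raise ValueError("a cloze passage must contain at least one blank")
    if not 0 <= raw_correct <= num_blanks:
        raise ValueError("raw_correct must lie between 0 and num_blanks")
    return 100.0 * raw_correct / num_blanks


# Hypothetical passages of different lengths: the larger raw score
# turns out to be the weaker performance.
print(percentage_score(22, 40))  # prints 55.0
print(percentage_score(27, 60))  # prints 45.0
```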

A logical approach to this problem may be to adopt a criterion score
equivalent to a performance level which has traditionally met with acceptance.
A survey of the literature reveals a basis for the selection of such a
performance level. Authorities in reading have been consistent in accepting the
75 per cent level as an acceptable performance level in reading comprehension.
The assumption is made that material on which a pupil is capable of comprehending
75 per cent of what he has read is of suitable difficulty to be used for purposes
of guided instruction.

Perhaps the earliest mention of the 75 per cent performance criterion was
made by Thorndike in 1917. He suggested the use of the 75 per cent performance
level and maintained that if a pupil was capable of answering 75 per cent of the
comprehension questions covering the reading material, such material was suitable
for use in supervised instructional situations.

Coleman's study attempted to determine which level of paragraph difficulty
resulted in the greatest amount of information gain. He used a transparency which
contained selected reading passages. While the subjects watched the projected
image on a screen, Coleman peeled back the strips of tape which covered the
sentences. As the tape was removed in a left-to-right motion, Coleman asked his
subjects to guess the succeeding words. The tape was replaced at the completion
of this test and the procedure repeated. Coleman determined the difference between
the scores on both tests. Maximum information gain occurred on passages on which
the subjects scored close to 44 per cent. A slight caution is advisable here in
that visual memory is likely to have affected these scores.



Among reading authorities, Betts adopted the 75 per cent comprehension
level and it became part of his IRI. Harris, in 1962, suggested the use of this

Betts, Foundations of Reading Instruction (New York: American Book Company, 1946).

A. J. Harris, Effective Teaching of Reading (New York: David McKay, 1962).

Correction (New York: Appleton-Century-Crofts, [year illegible]).

Bormuth questioned the adoption of the 75 per cent criterion and stated, after
a review of the literature, that:

     The problem is that neither Thorndike nor anyone known to [illegible]
     has given either logical or empirical reasons as to why these standards
     should be accepted.

     This problem is potentially too serious to be left unattended.71

This raises an interesting point of contention. Bormuth obviously felt it was
necessary to have empirical evidence to justify a concept or practice which has
general acceptance among reading authorities and which does, indeed, seem to "work"
for teachers and diagnosticians. Furthermore, it seems that this performance
criterion has been validated through use. Not content with non-empirical validation,
Bormuth designed a series of studies in an attempt to arrive at criterion levels
for interpreting cloze test scores. Interestingly enough, several of Bormuth's
studies yielded information similar to Coleman's study. As previously stated,
Coleman's study used a modified cloze procedure, and he suggested that maximum
information gain occurred at the 44 per cent score. Bormuth's later studies yielded
similar information.

In his initial study, aimed at the problem of justifying the 75 per cent
and the 90 per cent comprehension scores (90 per cent comprehension is indicative
of the independent level; 75 per cent is indicative of the instructional level), and
establishing a frame of reference from which cloze test scores could be interpreted,
Bormuth found that pairs of pupils with scores of less than 17 per cent on cloze
tests exhibited little information gain. Those pairs of pupils whose scores were
in the range of 17 per cent to 37 per cent showed a sharp increase in information
gain.72 Pupils whose cloze scores were higher tended to level off with but a slight
increase in information gain. One of Bormuth's hypotheses was that the level of
maximum information gain on a cloze test, that is, about 37 per cent, is comparable
to the 75 per cent level of performance which has been traditionally accepted as the
instructional level when ascertained by multiple-choice questions. Similarly, he
found that a cloze score of 57 per cent is comparable to the 90 per cent level of
comprehension mentioned earlier. Bormuth cautioned against accepting these results
based solely upon one study.

71 John Bormuth, "Empirical Determination of the Instructional Reading Level,"
Proceedings of the International Reading Association, XIII, Part I (Newark, Delaware:
International Reading Association, 1968), p. 716.

Bormuth supported the results he found in his first study with a similar second
study. He constructed two reading tests, a cloze and a multiple-choice test, over an
identical reading passage. Results indicated that a multiple-choice test score of
75 per cent is comparable to a cloze test score of 44 per cent over the same material.
Further, a multiple-choice test score of 90 per cent is comparable to a cloze test
score of 57 per cent.73

In the next study in this series, Bormuth contended:

     When a student's score falls between 44 and 57 per cent on one
     of these tests, the materials are at the level of difficulty thought
     to be suitable for use in his supervised instruction. Materials on
     which a student's score is above 57 per cent are suitable for use in
     his independent study.74



72 Ibid., pp. 720-21.

73 John Bormuth, "Comparable Cloze and Multiple-Choice Test Scores," op. cit.,
pp. 291-99.

74 John Bormuth, "Cloze Readability Procedure," op. cit., p. 434.



Culhane and Rankin, in a similar study, reported a cloze test score of 41 per
cent to be comparable to the 75 per cent criterion, and a cloze test score of 61
per cent to be comparable to the 90 per cent criterion. Culhane and Rankin concluded:

     In view of these findings, it is now possible for teachers to interpret
     cloze test scores with some degree of confidence by using specific
     percentage scores as criteria of acceptable performance.75

The scores attained by Culhane and Rankin differed from Bormuth's by four score
points. Bormuth's scores of 44 and 57 were actually scores which had been statistically
corrected for guessing. This correction was based upon the assumption that an
element of guessing enters into a multiple-choice testing situation. Bormuth's
conclusion was that uncorrected cloze test scores of 38 per cent and 50 per cent are
comparable to the 75 per cent and 90 per cent criteria respectively.
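
Criterion scores of this kind lend themselves to a simple lookup. The sketch below uses the uncorrected pair of 38 and 50 per cent as its default; the function name, the label for scores under the lower criterion, and the treatment of scores falling exactly on a criterion are assumptions made for illustration, not rules stated in the studies reviewed.

```python
# Uncorrected cloze criteria reported as comparable to the 75 and 90
# per cent comprehension criteria.
UNCORRECTED_CRITERIA = (38.0, 50.0)


def reading_level(cloze_percent, criteria=UNCORRECTED_CRITERIA):
    """Place a cloze percentage score relative to two criterion scores.

    Boundary handling is an assumption: a score above the upper
    criterion is treated as independent, and a score from the lower
    criterion up to and including the upper one as instructional.
    """
    instructional_min, independent_min = criteria
    if cloze_percent > independent_min:
        return "independent"
    if cloze_percent >= instructional_min:
        return "instructional"
    return "below instructional"


print(reading_level(42.5))  # prints instructional
```

Another pair of criteria, such as the scores corrected for guessing, can be supplied through the `criteria` argument.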

In a fourth study, Bormuth designed a testing situation which is quite similar
to the tasks performed by subjects during the administration of an IRI. Using different
subjects and materials, Bormuth asked his subjects to read the passages and respond
to questions orally. Again, a cloze score of 44 per cent was found to be comparable
to the 75 per cent criterion score.76

Rankin designed a study to investigate the validity of Bormuth's criterion scores.
He concluded that the scores were indeed valid and submitted:



75 J. P. Culhane and E. F. Rankin, "Comparable Cloze and Multiple-Choice
Comprehension Test Scores," Journal of Reading, XIII (December, 1969), 193-98.

76 Bormuth, "The Implications and Use of the Cloze Procedure in the Evaluation
of Instructional Programs," op. cit., p. 23.



     Using the 44 per cent criteria (i.e., 75 per cent multiple-choice
     comprehension), it is clear that there is a fairly close correspondence
     between results of the formulae employed to check the criteria.77

He further stated:

     In using the 53 per cent criterion (rounded up due to test construction
     concerning fractions of raw scores), we can evaluate the Independent
     Reading Level by using the cloze procedure.78

In summary, it appears that sufficient justification exists in the literature
for assuming that a cloze test score of 44 per cent is comparable to a multiple-
choice or sentence-completion test score of 75 per cent. It also appears that a cloze
test score of 57 per cent is comparable to a multiple-choice or sentence-completion
test score of 90 per cent. The assumption may safely be made that the kind of reading
skills needed during the completion of a cloze test is little different from the kind
needed on other types of reading tests. Bormuth's contention on this point is as
follows:

     The correlations between cloze readability and conventional comprehension
     test scores are high, and none of the research has presented convincing
     evidence that the processes employed in responding to cloze readability
     tests are, in any major sense, distinguishable from those employed in
     responding to conventional comprehension tests.79

The purpose of many reading tests, notably IRIs and standardized tests, is to
give an indication of the instructional level of the pupil. A number of studies have
been designed to investigate whether cloze tests are usable for purposes of determining
a pupil's instructional level. A review of these studies follows.

77 Rankin, op. cit., p. 6.

78 Ibid.

79 Bormuth, "The Implications and Use of the Cloze Procedure in the Evaluation
of Instructional Programs," op. cit., p. 25.

The Use of Cloze Tests to Determine
Instructional Level

The initial study in which an attempt was made to determine reading levels by
cloze testing was conducted by Ransom in 1965.80 Ransom administered an informal
reading inventory and a cloze test to her population sample and investigated the
relationship between the scores. With no research on criterion scores available,
Ransom set performance intervals after an examination of pupils' scores. No further
justification or rationale was given to validate these intervals empirically. Cloze
test percentage scores of 50 and above were considered as being comparable to the
independent level of the IRI. Percentage scores which fell in the interval of 30
per cent to 50 per cent were considered as being comparable to the instructional level,
and cloze scores below 20 per cent were considered as being comparable to the
frustration level.81

■ ■ /\ ■ ■ . ' 

After applying these performance criteria to the scores, Ransom reported
statistically significant correlations between the cloze test scores and the IRI test
scores at the instructional and frustration reading levels for all grade levels except
first grade. For the first grade level, correlations failed to achieve significance for
all three reading levels. Correlation coefficients for the independent reading level
were not significant at the .01 level for a majority of the grade levels tested.82

It is interesting to note that Bormuth's empirically validated criterion
range of 38 per cent to 50 per cent (not corrected for guessing) is quite
similar to Ransom's range which, it appears, was the result of an arbitrary
decision.

80 Ransom, loc. cit.

81 Ibid., p. 39.

82 Ibid., p. 61.

Kirby's study, which used the same population which Ransom had tested,
contained a comparison of scores made by pupils on several standardized oral
and silent reading tests with cloze test scores. The performance intervals
used by Ransom in her study were also used by Kirby in the analysis of cloze
test performance. Kirby stated in her conclusions that "the findings of this
study indicate that the use of a cloze test by classroom teachers for determining
instructional reading levels of children is a promising technique."83

In a study similar to Ransom's, Kirchhoff investigated the relationships
between scores on a cloze test and scores on an IRI using a population of
first grade children being taught by different methods of reading instruction.
Kirchhoff found significant correlations between the cloze test scores and
scores on the IRI for all first grade children. In combination, Ransom's and
Kirchhoff's studies found significant correlations for all grade levels of the
elementary schools. It appears that Kirchhoff also used Ransom's criteria for
interpreting cloze test scores. Kirchhoff submitted:

     The correlations between the cloze test scores and the informal
     reading inventory scores at the instructional and frustration reading
     levels were statistically significant. This close relationship would
     tend to indicate that a cloze test could aid the first grade teacher
     in determining the appropriate reading level for children nearing the
     end of their first grade experience.84



83 Clara L. Kirby, "A Comparison of Scores Obtained on Standardized Oral
and Silent Reading Tests and a Cloze Test" (unpublished doctoral dissertation,
Ball State University, 1967), p. 82.

84 Leo Kirchhoff, "A Study Utilizing the Cloze Test Procedure to Determine
Reading Levels of First Grade Children Who Have Been Taught Beginning Reading
by Four Different Approaches" (unpublished doctoral dissertation, University
of Kansas, 1968).



Based upon findings of research thus far attempted, the future looks
promising for the cloze procedure as a method for determining instructional levels.

Summary of Cloze Procedure
The use of the cloze procedure as a method of determining relative
readability was first introduced by Wilson Taylor, who contended that the cloze
procedure was a superior method of determining readability since it measured
a facet of readability called concept load. Taylor's research supported his
contention that other readability methods, namely the various readability
formulae, could be fooled by literature containing a low vocabulary load with
a high concept load. Authorities have accepted cloze tests as valid and
reliable instruments capable of measuring general comprehension skills.

Agreement now exists as to the accepted methodology of cloze test
construction. A deletion system of every fifth word with standardized blank length
has been accepted, although recent studies have submitted that blank length
is not crucial to the construction of a cloze test. Reliability coefficients
are highest when cloze passages consist of at least 250 words. Scoring cloze
tests by the exact-word-response is the most valid method and eliminates any
subjective judgment on the part of the examiner. Raw cloze scores are converted
to percentage scores to facilitate comparisons among passages of different
lengths. Research studies have demonstrated that a cloze score of 38 per cent
is comparable to a score of 75 per cent as measured by traditional types of
comprehension tasks. If a comparison with multiple-choice tests is desired,
the score, corrected for guessing, approaches 44 per cent. These scores are
indicative of the pupils' instructional levels; scores of 50 and 57 per cent
(corrected for guessing) are comparable to the independent level of 90 per cent
as measured by traditional types of comprehension tasks.
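
Exact-word scoring, as summarized above, can be sketched as a comparison of each response with the deleted word. The function name and the sample responses are hypothetical. Ignoring capitalization and surrounding spaces is an assumption of this sketch, as is its simplification of counting misspelled responses as wrong.

```python
def score_exact_word(responses, answers):
    """Return the raw number of blanks filled with the exact deleted word.

    Synonyms earn no credit, so no judgment by the examiner is involved.
    """
    if len(responses) != len(answers):
        raise ValueError("one response is required for every blank")

    def normalize(word):
        # Assumption: capitalization and stray spaces are not scored.
        return word.strip().lower()

    return sum(
        1
        for response, answer in zip(responses, answers)
        if normalize(response) == normalize(answer)
    )


# Hypothetical responses to three blanks; "quick" is a synonym and scores zero.
print(score_exact_word(["jumps", "quick", "Lazy "], ["jumps", "fast", "lazy"]))
```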

Three studies indicated the suitability of the cloze procedure as a method
of determining the instructional levels of elementary school pupils.85, 86, 87

A review of the literature reveals many research weaknesses.
Many researchers failed to mention the procedures which were employed in cloze
test construction. Deletion systems and scoring procedures are often described
inadequately. The conceptual difficulty of the cloze materials is rarely mentioned,
nor are the readability levels of materials used or the sources of the materials.
In studies where cloze tests are correlated with questions, the
nature of the questions, their validity, and their development are not described.

Serious sampling errors exist. How subjects are selected is rarely revealed,
nor are the actual reading abilities of these subjects given. This latter
point is absolutely essential to the interpretation of performance on a cloze
test.

More needs to be known about the construct validity of cloze tests. Several
affective components operating during the completion of a cloze test also need
to be investigated. The effects of author's style, passage content which is
not congruent with the subject's personal beliefs, and dialectal and syntactical
deviations from the subjects' speech patterns should be investigated.

The cloze procedure has contributed a great deal to the areas of both
reading and language. It is a promising technique and, with the advent of
future research, it can prove to be an invaluable tool.



85 Ransom, loc. cit.

86 Kirby, loc. cit.

87 Kirchhoff, loc. cit.



